Evaluating and correcting phoneme segmentation for unit selection synthesis

Authors

  • John Kominek
  • Christina L. Bennett
  • Alan W. Black
Abstract

As part of improved support for building unit selection voices, the Festival speech synthesis system now includes two algorithms for automatic labeling of wavefile data. The two methods are based on dynamic time warping and HMM-based acoustic modeling. Our experiments show that DTW is more accurate 70% of the time, but is also more prone to gross labeling errors. HMM modeling exhibits a systematic bias of 15 ms. Combining both methods directs human labelers towards data most likely to be problematic.
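The idea of combining both labelers to direct human attention can be illustrated with a minimal sketch. It is not the procedure used in the paper: the function name `flag_disagreements`, the 40 ms threshold, and the sign convention for the bias correction are all assumptions made for illustration. The sketch takes two lists of phoneme boundary times (in milliseconds) for the same utterance, optionally subtracts a constant offset from the HMM boundaries, and returns the boundaries where the two labelings disagree most.

```python
from typing import List, Tuple


def flag_disagreements(
    dtw_bounds_ms: List[float],
    hmm_bounds_ms: List[float],
    threshold_ms: float = 40.0,   # hypothetical tolerance, not from the paper
    hmm_bias_ms: float = 0.0,     # assumed constant offset subtracted from HMM times
) -> List[Tuple[int, float]]:
    """Return (boundary index, disagreement in ms), largest disagreement first."""
    if len(dtw_bounds_ms) != len(hmm_bounds_ms):
        raise ValueError("both labelings must contain the same number of boundaries")

    flagged = []
    for i, (dtw_t, hmm_t) in enumerate(zip(dtw_bounds_ms, hmm_bounds_ms)):
        # Remove the assumed systematic offset before comparing.
        diff = abs(dtw_t - (hmm_t - hmm_bias_ms))
        if diff > threshold_ms:
            flagged.append((i, diff))

    # Largest disagreements are the most likely gross errors, so list them first.
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged


# Made-up boundary times for one utterance (milliseconds).
dtw = [100.0, 250.0, 400.0, 620.0]
hmm = [110.0, 260.0, 550.0, 630.0]
suspects = flag_disagreements(dtw, hmm, threshold_ms=40.0, hmm_bias_ms=15.0)
suspect_indices = [index for index, _ in suspects]
```

With the made-up data above, only the third boundary (index 2) exceeds the threshold, so a human labeler would be directed to that one boundary rather than to the whole utterance.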

Related papers

Syllable Specific Unit Selection Cost Function Using a Tone Modeling Technique for Automatic Phonetic Segmentation of Hindi Speech Using HMM

This paper presents a technique for improving tone correctness in speech synthesis of a tonal language, based on an average-voice model trained with a corpus of nonprofessional speakers' speech. Unit selection-based concatenative synthesis is one of the most widely used speech synthesis approaches. This approach overcomes the limitations of other synthesis techniques such as articulatory synthesis an...

Automatic error detection in alignments for speech synthesis

The phonetic segmentation of recorded speech is a crucial factor in the quality of concatenative systems for speech synthesis. We describe a likelihood-based error detection process that can be used to flag possible errors in such a segmentation, with a view towards manual correction. It is shown that this process can be used to assist in the creation of high-accuracy segmentations. In partic...

Automatic Phoneme Segmentation with Relaxed Textual Constraints

Speech synthesis by unit selection requires the segmentation of a large single speaker high quality recording. Automatic speech recognition techniques, e.g. Hidden Markov Models (HMM), can be optimised for maximum segmentation accuracy. This paper presents the results of tuning such a phoneme segmentation system. Firstly, using no text transcription, the design of an HMM phoneme recogniser is o...

Fully automatic segmentation for prosodic speech corpora

While automatic methods for phonetic segmentation of speech can help with rapid annotation of corpora, most methods rely either on manually segmented data to initially train the process or manual post-processing. This is very time-consuming and slows down porting of speech systems to new languages. In the context of prosody corpora for text-to-speech (TTS) systems, we investigated methods for f...

Automatic Speech Segmentation Based on HMM

This contribution deals with the problem of automatic phoneme segmentation using HMMs. Automating the speech segmentation task is important for applications where large amounts of data must be processed, so manual segmentation is out of the question. In this paper we focus on automatic segmentation of recordings that will be used to create a triphone synthesis unit database. For speech...

Publication date: 2003